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Abstract.  Hybrid  Eulerian-Lagrangian  semi-implicit  time-integrators  (HELSI)  are  presented  which 
use  the  standard  semi-implicit  formulation  as  their  starting  point.  The  advantage  of  such  an  ap¬ 
proach  is  that  existing  models  which  employ  Eulerian  semi-implicit  time-integrators  can  easily  be 
augmented  to  use  the  HELSI  approach,  that  is,  provided  that  either:  a  series  of  advection  problems 
can  be  solved  efficiently  as  in  the  operator-integration-factor  splitting  (OIFS)  method  or  that  the 
grid  and  underlying  numerics  allow  the  construction  of  efficient  search  and  interpolation  stencils 
typically  required  of  semi-Lagrangian  methods.  Once  the  advection  problem  is  solved  the  approach 
then  follows  the  standard  semi-implicit  approach.  An  important  caveat  of  this  approach  is  that  the 
semi-implicit  time- integrator  must  be  constructed  around  the  implicit  backward  difference  formulas. 
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1  Introduction 

Although  explicit  time-integrators  are  the  easiest  methods  to  implement  their  main  disadvantage 
is  that  small  time  steps  must  be  observed  in  order  to  maintain  stability.  In  atmospheric  modeling, 
the  reason  for  this  prohibitively  small  time- step  is  due  to  the  fast  moving  gravity  waves.  These 
waves  require  a  small  time-step  while  only  carrying  a  very  small  percent  of  the  total  energy  in 
the  system.  In  order  to  ameliorate  this  rather  stringent  time-step  restriction  researchers  have  tried 
various  approaches  such  as  using  a  larger  differencing  stencil  for  the  gravity  wave  terms  thereby 
effectively  reducing  the  Courant  number  [1,  2],  and  discretizing  the  gravity  wave  terms  implicitly  in 
time  [3] .  It  turns  out  that  discretizing  the  gravity  wave  terms  implicitly  in  time  is  the  more  effective 
way  of  increasing  the  time-step. 

After  the  gravity  wave  terms  have  been  successfully  discretized  the  next  set  of  terms  responsible 
for  controlling  the  maximum  time-step  are  the  Rossby  waves  (advection  terms).  In  order  to  use 
increasingly  larger  time  steps  researchers  have  turned  to  Lagrangian  methods  for  treating  these 
recalcitrant  terms  [4,  5].  By  rewriting  the  equations  in  terms  of  the  Lagrangian  derivative  the 
troublesome  advection  terms  are  absorbed  into  the  Lagrangian  derivative.  Thus  the  equations  in  this 
form  are  now  discretized  in  time  along  characteristics  which  results  in  a  much  more  stable  numerical 
method  due  to  the  virtual  disappearance  of  the  Courant-Friedrichs-Lewy  (CFL)  condition. 

In  the  current  paper  we  follow  the  approach  put  forth  by  Maday  et  al.  [6]  which  they  called 
operator-integration-factor  splitting  or  OIFS  for  short.  The  approach  that  we  present  in  the  current 
work  uses  the  idea  of  an  integration  factor  and  the  OIFS  approach  to  present  a  unified  view  of 
semi-Lagrangian  and  OIFS  methods.  The  idea  of  viewing  SL  and  OIFS  as  similar  methods  is  not 
new.  In  fact,  Maday  et  al.  [6]  showed  that  in  the  OIFS  method  the  advection  integrating  factor  is 
an  approximation  to  the  Lagrangian  derivative  while  Boyd  [7]  refers  to  SL  as  an  integration  factor 
method  and  Xiu  et  al.  [8]  compare  the  two  methods  for  the  incompressible  Navier-Stokes  equations. 
Two  recently  proposed  approaches  for  the  shallow  water  equations  (SWE)  that  deserve  mention 
are  the  high-order  SL  method  of  Giraldo  et  al.  [9]  and  the  OIFS  method  of  St-Cyr  and  Thomas 
[10].  Although  these  two  approaches  have  shown  to  be  quite  powerful,  they  are  particularly  geared 
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for  the  SWE.  For  this  reason,  it  is  not  immediately  obvious  how  to  extend  these  two  methods  for 
more  complicated  equation  sets  such  as  the  hydrostatic  primitive  equations  (HPE)  used  in  numerical 
weather  and  climate  modeling  or  the  oceanic  Navier-Stokes  equations. 

In  short,  the  contribution  of  the  current  manuscript  to  the  literature  is  that  we  show  how  to 
build  a  unified  view  of  integration  factors  directly  onto  the  standard  semi-implicit  method.  This 
makes  it  quite  straightforward  to  augment  an  existing  semi-implicit  model  including  highly  complex 
models  that  solve  the  HPE  which  we  reserve  for  a  forthcoming  paper. 


2  Shallow  Water  Equations 


While  the  proposed  time-integrators  that  we  refer  to  as  HELSI  can  be  used  on  any  time-dependent 
system  of  partial  differential  equations  (PDEs) ,  to  simplify  the  exposition  we  shall  apply  them  only 
to  the  shallow  water  equations  (SWE)  on  a  rotating  sphere.  The  reason  for  choosing  this  set  of 
PDEs  is  because  the  spherical  domain  allows  boundary  conditions  to  be  treated  quite  simply  using 
periodicity  and  because  the  SWE  are  a  good  testbed  for  the  construction  of  atmospheric  models  and 
thereby  represent  a  relevant  equation  set. 

The  shallow  water  equations  are  a  system  of  first  order  nonlinear  hyperbolic  equations  which 
govern  the  motion  of  an  inviscid  incompressible  fluid  in  a  shallow  depth.  The  predominant  feature 
of  this  type  of  fluid  is  that  the  characteristic  length  of  the  fluid  is  far  greater  than  its  depth  which 
is  analogous  to  the  motion  of  air  in  the  atmosphere  and  water  in  the  ocean.  For  this  reason  these 
equations  are  typically  used  as  a  first  step  towards  the  construction  of  numerical  weather  prediction 
(NWP),  climate,  and  ocean  models. 

Let  us  write  the  SWE  in  the  following  vector  form 

§  =  &<,)  ID 


with 


Ss(q)  =  - 


V  •  {4m) 

u  ■  Vm  +  “^f{x  x  u)  +  V(0  +  4>s ) 


(2) 


where  the  subscript  E  is  meant  to  remind  the  reader  that  the  equations  are  written  in  an  Eulerian 
reference  frame.  Furthermore 

q  =  {4>,u,v,w)T 


represents  the  state  vector  composed  of  the  geopotential  height,  <f>,  and  the  three  Cartesian  velocity 
components,  ( u,v,w ),  and  4>s  represents  the  surface  topography  (e.g.,  mountains). 

Alternatively,  the  SWE  can  be  written  more  compactly  in  terms  of  the  Lagrangian  derivative 


d 

dt 


dt 


+  u  ■  V 


as  follows 

dq 

dt 

where  the  source  function  is  now  defined  as 


S(q) 


s(q) 


•  u 

?£f{x  x  u)  +  v(0  +  4>s ) 


(3) 

(4) 
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3  Eulerian  Semi-Implicit  Time-Integrators 

The  Eulerian  semi-implicit  (ESI)  discretization  of  Eq.  (1)  is  as  follows 


^L={SE(q)-5L(q)}+S[L(q)] 


(6) 
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where  the  terms  inside  the  curly  brackets  are  time-integrated  explicitly,  those  inside  the  square 
brackets  implicitly,  L  represents  the  linearization  of  S,  and  <5  =  0  or  1  depending  on  whether  a 
purely  explicit  or  semi-implicit  method  is  chosen.  The  linearized  operator  L  is  in  fact  defined  as 
follows 

= -  ( V )  (7) 

where  $  denotes  some  mean  height.  It  should  be  mentioned  that  the  operator  L  does  not  change  from 
the  Eulerian  to  Lagrangian  reference  frame  because  no  linearization  about  the  velocity  field  is  applied 
consistently  with  the  approach  undertaken  for  all  atmospheric  models  that  use  the  hydrostatic 
primitive  equations. 

In  [11]  we  studied  various  Eulerian  semi- implicit  time-integrators  and  determined  that  for  the 
atmospheric  hydrostatic  primitive  equations  (HPE)  the  second  order  backward  difference  formulas 
(BDF2)  are  a  good  choice.  A  straightforward  BDF2  discretization  of  Eq.  (6)  for  <5  =  0  is 

qn+1  =  a0qn  +  aiq"-1  +  7  A  tSE(q)n+1.  (8) 


However,  Eq.  (8)  would  require  solving  an  implicit  nonlinear  problem.  Instead,  we  use  the  splitting 
proposed  by  Karniadakis  et  al.  [12]  to  extrapolate  the  nonlinear  terms  in  SE(q)n+1  with  known 
values  and  solving  the  linear  terms  implicitly;  this  is  the  idea  behind  the  BDF2  semi-implicit  time- 
integrator  which  we  now  describe. 

A  BDF2  semi-implicit  time-integration  of  Eq.  (1)  is  then 


1  2  /  2 

qn+1  =  J2  a^n~m  +7At  PmSE{q)n~m  +  51AtL  ]T  pmqn~m 

m—0  m= 0  \m=— 1 


(9) 


where  the  /?s  are  the  extrapolation  coefficients  and  the  term  inside  L  is  obtained  by  combining  the 
explicit  and  implict  L  in  Eq.  (6).  For  completeness  in  Table  I  we  include  the  BDF2  coefficients; 
note  that  the  scheme  BDF2A  is  the  semi-implicit  time-integrator  proposed  in  [12]  and  BDF2B  was 
proposed  by  Hulsten  (personal  communication)  and  used  for  the  hydrostatic  primitive  equations  in 
[11].  For  all  numerical  experiments  we  use  BDF2B  exclusively  (the  stability  regions  of  BDF2A  and 
BDF2B  can  be  found  in  [11]). 
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Table  I:  Coefficients  of  the  backward  difference  formulas  corresponding  to  Eq.  (9). 

The  reason  why  Eq.  (9)  is  described  as  an  Eulerian  semi-implicit  time-integrator  is  because  all 
of  the  quantities  present  are  constructed  in  an  Eulerian  fashion,  that  is,  q  represents  values  at  a 
specific  grid  point  but  defined  at  the  various  time-levels  ranging  from  n  —  1,  n,  to  n  +  1.  Note  that 
q  in  the  summation  term  Em=o  amQn~m  is  constructed  in  an  explicit  Eulerian  fashion.  This  point 
will  become  much  more  apparent  in  the  next  section  where  we  discuss  the  HELSI  time- integrators. 
We  can  now  simplify  Eq.  (9)  by  extracting  its  fully  explicit  solution  as  follows 

1  2 

gexpIicit  =  £  amqn-m  +  ^  £  pmSE{q)n~m .  (10) 

m= 0  m= 0 

Multiplying  Eq.  9  by  p- 1,  and  adding  E^=0pmq"-m  yields 

Qtt  =Qe+  S-fAtp-iL  (, qtf )  (11) 

where 

+  (12) 

m= 0 
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and 


2  2 

Qtt  =  £  PmQn~m  =  P-iQn+1  +  £  Pmqn~m.  (13) 

m=— 1  m=0 

Eq.  (11)  is  the  actual  expression  used  to  solve  the  semi-implicit  problem.  This  completes  the  con¬ 
struction  of  the  Eulerian  semi-implicit  time-integrator. 

4  Hybrid  Eulerian-Lagrangian  Semi-Implicit  Time-Integrators 

4.1  Review  of  Operator  Integration  Factors 

In  order  to  understand  the  unified  view  of  OIFS  and  SL  it  is  important  to  understand  the  concept 
of  integrating  factors;  this  idea  is  used  for  simplifying  the  solution  of  partial  differential  equations 
(PDEs)  by  essentially  eliminating  one  of  the  terms.  Let  us  write  a  generalized  hyperbolic-parabolic 
PDE  in  the  following  way 

^+u-Vq  =  S  (14) 

where  S  is  a  vector  that  contains  all  the  source  terms.  We  can  now  define  an  integrating  factor, 

Q i  (t)  £  RNxN,  to  simplify  the  original  PDE  to  the  following  form 

(t)  •  q(t)  =  Qe  (t)  ■  S  (15) 

where  N  is  the  number  of  grid  points,  t*  >  t  is  any  time  at  which  we  would  like  to  evaluate  the 
integrating  factor,  and  Qf  (t*)  =  I  where  I  £  rn><n  js  the  identity  matrix.  It  turns  out  (see  [6]) 
that  the  following  equality  can  be  derived 


Qt\i)-q{i)  =  q^’tXt*  -t)  (16) 

which  then  allows  us  to  rewrite  Eq.  (15)  as  follows 

-t)  =  Qt\t)-S  (17) 

where  the  new  variable  q  is  the  solution  to  the  advection  problem 

;t)( s )  =  -u(s)  ■  Vq(t  ]t)(s)  (18) 

for  0  <  s  <  t  *  -t  with  the  initial  condition  q(t  ’^(0)  =  q(t).  Now,  setting  t*  =  tn+l  yields  the  final 
form  of  Eq.  (17) 

^q(*"+1;t")(A  t)  =  S  (19) 

where  the  solution  of  Eq.  (18)  can  be  carried  out  in  any  desired  way.  We  shall  next  explore  using 
the  semi-Lagrangian  (SL)  and  Operator-Integration-Factor  Splitting  (OIFS)  methods.  Comparing 
Eqs.  (14)  and  (19)  it  becomes  immediately  obvious  that  we  could  have  written  the  original  system 
as  follows 

! = s  <2o> 

which  is  the  form  we  shall  use  in  the  next  section. 
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4.2  HELSI 

The  hybrid  Eulerian-Lagrangian  semi-implicit  discretization  of  Eq.  (4)  is 

§  =  {S(q)-SL(q)}  +  S[L(q)]  (21) 

where,  once  again,  the  terms  inside  the  curly  brackets  are  time-integrated  explicitly,  and  those  inside 
the  square  brackets  implicitly.  As  mentioned  previously  the  linearized  operator  L  is  defined  as  in 
the  Eulerian  case,  that  is,  Eq.  (7). 

Applying  HELSI  to  Eq.  (21)  yields 

1  2  /  2 

qn+1  =  E  a^m  +  7At  E  PrnS(q)n-m  +  61AtL  E  p™q 

m= 0  m= 0  \m=— 1 

where  the  tilde  above  q  denotes  that  this  quantity  is  determined  along  characteristics  in  a  semi- 
Lagrangian  sense.  The  reason  why  Eq.  (22)  is  described  as  a  hybrid  Eulerian-Lagrangian  semi- 
implicit  time-integrator  is  because  the  quantities  corresponding  to  the  time  derivatives  (q)  are  com¬ 
puted  in  a  Lagrangian  sense  while  the  remaining  q  corresponding  to  the  source  function  ( S )  and 
its  linearization  (L)  are  computed  in  an  Eulerian  fashion.  Note  that  by  using  BDFs  the  source 
term  does  not  have  to  be  evaluated  along  the  Lagrangian  path  (unlike  in  semi-Lagrangian  centered 
schemes  typically  used  in  NWP  models  [4,  5]).  This  allows  for  a  standard  semi- implicit  model  to  be 
easily  augmented  to  a  Lagrangian  semi-implicit  model;  furthermore,  because  the  source  terms  are 
not  computed  along  trajectories  allows  the  model  to  retain  accuracy  and  conservation. 

Applying  the  exact  same  simplifications  that  were  used  in  ESI  we  can  now  write  the  final  HELSI 
form  as  follows 

qtt=q  +  SjAtp-iL  (qa)  (23) 

where 

2 

q  =  p_lQ™ P'-t  +  J2  pmqn  m 
m= 0 

and 

^explicit  =  amqn  m  +  7A t  E  l%nS 

m= 0  m= 0 

Comparing  Eqs.  (9)  and  (22)  we  see  that  the  only  change  required  to  augment  an  ESI  to  a  HELSI 
is  to  subtract  the  advection  terms  from  the  source  function  Se  and  then  deciding  how  q  will  be 
approximated  along  the  characteristics.  To  reiterate,  what  makes  this  augmentation  so  straightfor¬ 
ward  is  the  BDF-based  construction  of  the  semi-implicit  method.  In  addition,  Eqs.  (23)  through 
(25)  show  that  SL  and  OIFS  methods  may  be  managed  in  the  same  code  simply  by  replacing  q  with 
the  proper  approximation.  In  fact,  our  numerical  results  were  obtained  by  virtue  of  a  single  code. 
Let  us  now  describe  how  to  solve  the  advection  step  in  a  Lagrangian  sense  along  characteristics. 


w 


(24) 

(25) 


4.3  Advection  Step 


To  complete  the  HELSI  discretization  requires  the  construction  of  q  which  are  the  solution  val¬ 
ues  along  the  characteristics.  At  this  point  we  have  a  choice  as  to  which  method  to  use  for  this 
step.  The  choices  are  numerous  but  the  candidates  that  we  shall  explore  in  this  work  are  the 
operation-integration-factor  splitting  (OIFS)  and  semi-Lagrangian  (SL)  methods  (another  choice 
used  previously  is  the  characteristic  Galerkin  [13]).  Regardless  of  which  method  is  selected  the  goal 
is  to  solve  the  Lagrangian  derivative 


(26) 


accurately  and  efficiently. 
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4.3.1  Semi-Lagrangian  Method 


In  the  semi-Lagrangian  (SL)  method  one  simply  replaces  Eq.  (26)  by  the  solution  qn+l  =  q(xn,tn ) 
where  xn  =  x(tn)  is  the  foot  of  the  characteristics  that  departs  from  x(tn)  and  arrives  at  x(tn+1). 
Thus  the  advection  step  with  the  semi-Lagrangian  method  reduces  to  computing  the  departure  point 
via  the  particle  trajectory  equation 


(27) 


There  are  various  ways  of  solving  Eq.  (27)  but  typically  it  is  carried  out  via  the  implicit  midpoint 
rule  (which  requires  iterations)  or  by  Runge-Kutta  (RK)  methods.  For  example,  using  the  second 
order  Runge-Kutta  (RK2)  method  yields 


~n+l/2 


X 


n 


xn+1  -  —un+1 
2 

xn+l  -  A tun+l/2 


(28) 


where  x"  =  x(tn)  and  un+1^'2  =  u(x(tn+1^2),tn+1^2).  Because  the  velocity  field  is  known  only  at 
integer  time-levels  (e.g.,  n-1,  n,  n+1)  then  to  get  the  velocity  at  intermediate  times  (e.g.,  n-1/2  and 
n-f-1/2)  requires  interpolation  and  extrapolation. 


Figure  1:  Schematic  of  the  semi-Lagrangian  trajectory  computation.  The  thick  line  is  the  actual 
trajectory  while  the  arrows  denote  the  path  followed  to  compute  the  departure  points  using  a  RK2 
approximation. 


Once  the  departure  point  is  computed  we  then  have  to  interpolate  to  get  qn  =  q(xn,  tn)  because 
xn  will  not  generally  fall  on  a  grid  point.  However  looking  at  Eq.  (22)  we  see  that  we  not  only  need 
qn  but  also  q 0-1 .  Thus  we  apply  Eq.  (28)  one  more  time  to  get 


xn-l/ 2 


X 


xn  1  =  xn-A  tun-1/2 

which  is  then  used  to  compute  q n~1  =  q(£"_1,  t71^1). 

A  few  things  can  be  immediately  discerned  about  the  SL  method: 

1 .  The  departure  point  calculation,  Eq.  (28) ,  is  insensitive  to  the  dimension  of  q. 


(29) 
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2.  The  SL  method  becomes  increasingly  more  cost-effective  with  increasing  time-step. 

3.  Because  the  departure  point  values,  q,  are  computed  via  interpolation  this  then  allows  mono¬ 
tonicity  to  be  incorporated  into  the  interpolation  method.  One  common  practice  is  to  introduce 
a  quasi-flux-corrected  transport  (FCT)  method. 

4.  Some  structure  in  the  grid  must  exist  in  order  to  search  efficiently  for  the  departure  points  x 
which  are  then  used  to  interpolate  q. 

5.  An  interpolation  stencil  must  be  readily  available  which  requires  either  structure  in  the  grid, 
local  high-order  interpolation  functions,  or  an  easily  computable  polynomial  basis  construction 
of  the  discrete  spatial  operators. 

6.  On  a  distributed-memory  computer  the  departure  point  of  a  specific  grid  point  may  lie  off- 
processor. 

Items  1,  2,  and  3  are  the  virtues  of  the  SL  method  while  items  4,  5,  and  6  are  its  vices.  Item  1  is 
particularly  important  for  systems  of  equations  with  multiple  solution  variables.  An  example  is  a 
transport  model  which  may  carry  up  to  50  tracers. 

Since  the  cost  of  computing  the  trajectory  and  then  interpolating  is  not  dependent  on  the  time- 
step  size  (item  2)  then  for  pure  advection  problems  the  SL  method  becomes  increasingly  more  cost 
effective  with  increasing  time-step.  However,  for  more  complex  systems  the  condition  number  of  the 
matrix  (resulting  from  the  semi-implicit  approach)  increases  with  increasing  time-step  which  will 
adversely  affect  the  efficiency  of  the  overall  solution  approach.  Item  3  allows  monotonicity  to  be 
built  directly  into  the  interpolation  stencil  as  has  been  done  in  [14,  15,  16,  9]. 

Item  4  says  that  the  limitation  of  the  SL  method  is  that  grid  structure  must  exist  if  efficient  search 
and  interpolation  operators  are  to  be  constructed.  In  [16,  17,  9]  SL  formulations  on  unstructured 
grids  were  successfully  implemented  but  constructing  efficient  search  algorithms  on  unstructured 
grids  for  general  applications  remains  an  open  issue. 

Item  5  implies  that  either  there  must  exist  some  structure  in  the  grid  in  order  to  construct 
global  interpolation  functions  (e.g.,  cubic  splines)  or  local  high-order  interpolation  functions  must  be 
readily  available.  High-order  interpolation  functions  are  available,  for  example,  if  spectral  elements 
are  used  to  approximate  the  spatial  derivatives.  This  idea  was  exploited  in  [16,  17,  9]  to  construct 
high-order  SL  methods  on  unstructured  grids.  However,  in  these  works  the  numerical  method 
used  to  approximate  the  spatial  derivatives  were  constructed  from  an  easily  computable  polynomial 
expansion  (Legendre  and  Proriol-Koornwinder-Dubiner  polynomials);  such  a  polynomial-derived 
basis  expansion  may  not  always  be  available.  An  example  where  non-polynomial  bases  arise  is  in 
the  prolate  spheroidal  wave  functions  described  in  [18]  and  the  bubble-type  ad  hoc  high  order  finite 
elements  derived  in  [19]  both  of  which  we  have  explored  in  the  context  of  constructing  accurate  and 
efficient  nonlinear  models.  For  these  types  of  expansion  bases  OIFS  is  the  only  recourse. 

Perhaps  the  worse  vice  of  the  SL  method  (item  6)  is  that  on  distributed-memory  computers  the 
departure  point  calculation  may  require  communication  across  processors  (something  that  should 
be  minimized  for  the  sake  of  efficiency)  because  departure  points  may  lie  off-processor.  Let  us  next 
explore  a  second  option  for  computing  the  departure  point  values. 

4.3.2  Operator-Integration-Factor  Splitting  Method 

In  the  operator-integration-factor  splitting  (OIFS)  method  [6]  the  Lagrangian  derivative  is  computed 
using  Eulerian  substepping,  that  is,  the  goal  is  to  solve 

If  =  -»  •  V,  (30) 

with  q(s  =  0)  =  qn  for  the  approximation  of  qn  and  q(s  =  0)  =  qn~l  for  computing  g"-1.  To  see 
how  the  OIFS  method  is  applied  let  us  assume  that  we  wish  to  solve  Eq.  (30)  using  RK2.  We  then 


7 


get 


^■n+1/2 

Q  i 


sr+1 


q"  -  — «"  •  Vq" 

q”  -  At«"+1/2  •  Vq"+1/2 


(31) 


and 


q"_1//2  =  q"-1  -  •  Vq"-1 

q”  =  qn~l  -  A  tit"-1/2  •  Vqr1/2 

-^n+l/2  At  r-r^n 

q2  =92-  ~yu  •  v<?2 

q”+1  =  q"  -  Aftzra+1/2  •  Vq2+1/2  (32) 


where  q'!  =  q"+1  and  q'!_1  =  q"+1.  In  addition,  the  velocity  field  is  defined  as  un+1^2  = 
u(x(tn+1),tn+1/2)  and  is  interpolated/extrapolated  from  known  time  levels  but  the  difference  from 
the  SL  method  is  that  this  velocity  is  now  Eulerian  since  it  is  defined  at  the  arrival  point  x(tn+1). 
Note  that  the  velocity  field  u  is  still  somewhat  akin  to  a  velocity  along  characteristics  since  it  is 
not  updated  in  the  RK2  method  -  rather  it  is  interpolated  (for  n-1/2)  and  extrapolated  (for  n+1/2) 
using  it"-1  and  un .  Without  this  key  step  the  OIFS  method  would  not  be  able  to  retain  its  temporal 
accuracy. 


x(t-dt)  x(t)  x(t+dt) 


Figure  2:  Schematic  of  the  operator-integration-factor  splitting  (OIFS)  trajectory  computation. 
The  thick  line  is  the  actual  trajectory  while  the  arrows  denote  the  paths  followed  to  compute  the 
departure  point  values  via  a  RK2  approximation. 


However,  if  this  were  the  complete  story  then  OIFS  would  be  rather  limited.  One  can  increase 
the  maximum  allowable  time-step  without  losing  accuracy  or  stability  in  two  ways:  the  first  is  to 
use  higher  order  RK  methods  such  as  RK3  and  RK4  which  have  larger  stability  regions  than  RK2; 
the  second  is  to  use  substepping  such  that  each  time-step  is  broken  up  into  numerous  smaller  time- 
steps  which  are  then  passed  to  the  RK  method.  Furthermore,  these  two  points  can  be  combined  for 
further  increases  in  the  time- step.  Clearly  there  is  an  efficiency  price  to  be  paid  but  if  the  implicit 
part  of  the  solution  is  very  costly  then  it  makes  sense  to  minimize  the  number  of  implicit  solves  by 
using  a  very  large  time-step  thereby  using  multiple  substeps  in  the  OIFS  method. 
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A  few  remarks  regarding  the  OIFS  method  are  now  in  order: 

1.  The  success  of  the  method  relies  on  solving  an  advection  problem,  Eq.  (30),  efficiently. 

2.  For  transport  models  with  multiple  tracers  one  must  solve  an  OIFS  problem  for  each  individual 
tracer  which  results  in  a  prohibitive  cost. 

3.  No  grid  structure  is  required  thereby  OIFS  can  be  used  just  as  efficiently  on  completely  un¬ 
structured  grids. 

4.  On  a  distributed-memory  computer  the  OIFS  will  scale  because  the  advection  operator  is 
computed  mostly  on-processor;  the  only  off-processor  computation  is  in  the  communication 
required  by  the  spatial  discretization  method  to  either  compute  derivatives  (as  in  finite  element, 
finite  difference,  or  spectral  methods)  or  fluxes  (as  in  finite  volumes,  penalty,  and  discontinuous 
Galerkin  methods). 

5.  Because  the  departure  point  values  rely  not  on  interpolation  as  in  the  SL  method  but  on  solving 
an  advection  operator,  monotonicity  can  only  be  introduced  by  selecting  spatial  discretization 
methods  which  are  monotonicity  preserving. 

While  the  OIFS  method  can  be  applied  in  conjunction  with  any  spatial  discretization  method  (that 
is,  a  method  used  to  approximate  derivatives  such  as  the  gradient  in  Eq.  (30))  item  1  says  that 
some  spatial  discretization  methods  may  be  impractical  to  use  in  an  OIFS  approach.  For  example, 
if  high-order  finite  elements  are  used  which  result  in  a  large  sparse  mass  matrix  then  this  matrix 
would  have  to  be  inverted  at  every  step  of  the  OIFS  approach.  Thus  for  an  RK2  method,  Eqs.  (31) 
and  (32),  one  would  have  to  invert  the  mass  matrix  6  times;  however  if  two  substeps  are  used  then 
this  number  increases  to  12!  Clearly,  spatial  discretization  methods  which  either  do  not  have  a  mass 
matrix  or  have  one  that  is  diagonal  is  critical. 

Item  2  says  that  the  OIFS  approach  is  really  only  practical  for  equation  sets  with  few  ‘solution 
variables.  Examples  include  incompressible  Navier-Stokes,  and  hydrostatic  and  non-hydrostatic 
atmospheric  and  oceanic  equations,  to  name  a  few  relevant  equation  sets.  Transport  models  probably 
are  not  a  good  choice  for  OIFS. 

Item  3  shows  the  clear  advantage  of  OIFS  over  SL  on  todays  cache-based  distributed-memory 
computer  systems.  Each  stage  of  the  RK  advection  problem  will  require  some  form  of  communication 
across  processors  but  unlike  in  the  SL  method  load  balancing  is  guaranteed  since  each  processor  will 
always  perform  the  same  amount  of  work. 

Item  4  may  be  viewed  as  either  a  virtue  or  a  vice  depending  on  how  flexible  the  model  is 
with  regard  to  changing  the  spatial  discretization  methods.  For  example  if  one  is  using  continuous 
numerical  methods  (such  as  finite  difference,  finite  elements,  or  spectral  methods)  then  it  will  be 
quite  difficult  to  ensure  monotonicity  of  the  solution  unless  some  other  mechanism  is  applied  (such 
as  flux-corrected  transport  or  some  variant  thereof)  to  eliminate  oscillations  and  over/under  shoots 
at  the  base  of  the  discontinuity.  However,  if  the  discretization  lends  itself  to  be  altered  quite  easily 
then  one  can  change  the  spatial  discretization  for  the  advection  problem  to  one  which  can  ensure 
monotonicity  and  then  switch  back  to  the  original  method  for  the  implicit  solve.  For  example  one 
could  use  finite/spectral  elements  for  the  implicit  solve  and  discontinuous  Galerkin  (DG)  for  the 
advection  problem  [20,  8].  The  advantage  of  using  DG  methods  for  the  advection  step  is  that  one 
can  then  incorporate  Riemann  solvers  or  upwinding  flux  functions  with  total  variation  diminishing 
(TVD)  schemes  to  eliminate  any  over/under  shoots  and  spurious  oscillations  in  the  vicinity  of  shocks 
(see  [21,  22,  23]). 

4.3.3  Similarity  between  OIFS  and  SL  Methods 

Comparing  the  SL  and  OIFS  methods  one  can  see  the  strong  similarities  between  the  two  methods. 
To  compare  the  two  methods  we  only  need  to  look  at  the  ID  advection  equation  which  then  simplifies 
to  the  following  Lagrangian  solution 

qn+1  =q(xn,tn).  (33) 
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we  get  for  the 


Taking  Eq.  (28)  and  expanding  in  a  Taylor  series  about  the  arrival  point,  xn+1 , 
velocity 

u(xn+1/2,tn+1/2)  =  un+1/2  -  ^ ununx+1/ 2  +  0(Af2) 
where  un+ 1//2  =  u(xn+1,tn+1/2).  The  departure  point  can  now  be  written  as  follows 

A/2 

xn  =  xn+1  -  A tun+1/2  +  —  unu™+1/2  +  0(At3).  (35) 

Incorporating  this  expansion  for  the  departure  point  value  yields 

q"  =  qn-  A tun+1/2q™  +  ^  (unUnx+X/2qnx  +  un+1/2un+1/2q +  0(Af3).  (36) 

In  contrast,  a  straight  application  of  the  OIFS  RK2  method  given  in  Eq.  (31)  yields 

qn+l  =qn  _  Atun+l/2_jl_  (qn  _  (37) 

which,  after  differentiating  and  grouping  terms,  yields 

q"  =qn  -  Atun+1'2qnx  +  ^  (un+1/2u™q™  +  un+1/2unq +  0(At3).  (38) 

Comparing  Eqs.  (36)  with  (38)  shows  that  there  is  little  difference  between  the  SL  and  OIFS  methods 
provided  that  the  same  time- integrator  used  for  the  trajectory  computation  of  the  SL  method  is  used 
for  the  advection  problem  of  the  OIFS  method.  It  should  be  kept  in  mind  that  the  Taylor  series 
expansion  used  in  the  SL  derivation  is  only  an  approximation  since  in  reality  the  SL  method  uses 
interpolation  to  obtain  the  departure  point  values.  Therefore,  it  is  not  immediately  obvious  that  any 
useful  information  can  be  obtained  by  such  a  simple  analysis.  However,  it  turns  out  that  numerically 
both  the  OIFS  and  SL  methods  behave  similarly  provided  that  the  same  time-integrators  are  used. 
We  have  confirmed  this  for  RK2,  RK3,  and  RK4.  To  be  clear,  the  difference  in  accuracy  between 
the  two  methods  is  minimal.  The  main  difference  is  in  the  stability  regions  of  the  two  methods; 
the  SL  method,  being  a  true  Lagrangian  method,  avoids  the  CFL  condition  and  thereby  can  use  far 
larger  time-steps  than  OIFS.  In  the  next  section  we  compare  the  results  of  both  methods. 


5  Results 


For  the  numerical  experiments,  we  use  the  normalized  1/2  error  norm 


\\<P\\l2 


inexact  ~ 
ffl  ‘/’exact  dfi 


to  judge  the  accuracy  of  the  methods.  To  compute  the  Courant  number  the  elements  are  decomposed 
into  their  high-order  (HO)  grid  points  and  these  grid  points  form  a  fine  grid  which  we  refer  to  as 
the  HO  cells.  The  velocities  and  grid  spacings  are  then  defined  at  the  centers  of  these  cells.  Using 
these  definitions  the  Courant  number  is  then  defined  as 


C  =  max 


AAA' 
As  J 


Vee  [1  ,...,Ne] 


HO 


where 


U  for  case  1 

U  +  \f4>  for  all  other  cases 


where  A  is  the  characteristic  speed,  U  =  \Ju  ■  u.  and  As  =  \J Ax2  +  A y2  +  Az1  is  the  grid 


spacing. 
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Four  test  cases  are  used  to  judge  the  performance  of  the  HELSI  time-integrators.  Two  types  of 
grids  used  are  shown  in  Fig.  3.  Figure  3a  shows  the  hexahedral  grid  which  is  a  structured  grid  and  Fig. 
3b  shows  the  icosahedral  grid  which  is  an  unstructured  grid;  both  grids  have  approximately  the  same 
number  of  points  and  both  use  8th  order  polynomials.  We  use  the  hexahedral  grid  almost  exclusively 
in  this  paper  because  it  is  a  structured  grid  and,  therefore,  is  quite  easy  to  construct  efficient  search 
algorithms  (see  [9])  so  that  the  SL  method  can  be  used  efficiently  (for  search  algorithms  on  icosahedral 
grids  see  [24]).  However,  the  OIFS  method  does  not  care  what  the  grid  looks  like  and  so  either  the 
hexahedral  or  the  icosahedral  grid  can  be  used  just  as  efficiently.  For  all  of  the  problems  studied 
here,  however,  the  hexahedral  grid  yields  more  accurate  results  because  the  grid  is  aligned  with  the 
principle  direction  of  the  flow  (east  to  west). 


a) 


b) 


Figure  3:  The  a)  hexahedral  and  b)  icosahedral  grids  with  38,402  and  34,562  grid  points  respectively. 


5.1  Case  1:  Passive  Advection  of  a  Cosine  Wave 

Case  1  concerns  the  solid  body  rotation  of  a  cosine  wave.  The  velocity  field  remains  unchanged 
throughout  the  computation.  This  test  case  is  integrated  for  12  days  which  corresponds  to  one 
complete  revolution  of  the  wave  (see  [25]  for  its  definition) . 


a) 


Courant  Number 


•  RK2 
RK3 

•  RK4 


10 

Courant  Number 


b) 


Figure  4:  Case  1.  The  </>  L2  error  norm  as  a  function  of  Courant  number  for  a  grid  consisting  of  6 
elements  each  of  16th  order  for  the  a)  SL  and  b)  OIFS  methods. 


Figure  4  shows  the  errors  for  a  12  day  integration  for  6  elements  tiling  the  sphere  each  of 
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which  is  16th  order  accurate  for  a  total  of  1538  grid  points  (for  description  of  the  spectral  element 
discretization  see  [9])  .  In  Fig.  4a  we  see  that  the  SL  method  with  RK4  yields  better  results  than 
RK2  and  RK3  for  increasing  time-step.  Above  Courant  numbers  of  4  the  SL  RK2  method  loses 
accuracy  quite  rapidly.  In  contrast,  SL  RK3  retains  its  accuracy  up  to  Courant  numbers  of  11  at 
which  point  it  too  loses  its  accuracy  exponentially.  Finally,  SL  RK4  does  quite  well  even  for  Courant 
numbers  as  large  as  50;  however,  for  such  large  Courant  numbers  the  SL  method  begins  to  lose  its 
accuracy  and  should  therefore  not  be  used  for  this  range  of  Courant  numbers  for  this  particular  test 
case. 

Figure  4b  shows  the  results  for  OIFS  with  RK2,  RK3,  and  RK4.  The  OIFS  RK4  gives  the  best 
results  but  OIFS  RK3  yields  very  comparable  results.  The  OIFS  RK2  results  are  not  as  good  as  those 
for  RK3  and  RK4.  However,  the  one  striking  difference  between  the  various  OIFS  and  SL  methods 
is  that  all  the  OIFS  methods  give  somewhat  similar  results  (of  the  same  order  of  magnitude).  The 
reason  for  this  is  simple;  for  an  advection  operator  one  can  get  the  OIFS  method  to  yield  a  solution 
for  any  size  time-step.  All  one  needs  to  do  is  increase  the  number  of  substeps  per  time-step.  Thus 
in  the  results  shown  here  the  OIFS  RK2  was  able  to  run  with  Courant  numbers  as  large  as  50  but 
by  using  hundreds  of  substeps.  Because  one  can  increase  the  number  of  substeps  then  OIFS  can  use 
very  large  time-steps  without  incurring  too  large  a  loss  in  accuracy.  However,  as  we  will  see  in  the 
next  section  using  numerous  substeps  will  penalize  the  overall  efficiency  of  the  method  and  therefore 
should  be  minimized  whenever  possible. 

5.2  Case  2:  Steady-State  Nonlinear  Zonal  Geostrophic  Flow 

This  case  is  a  steady-state  solution  to  the  nonlinear  shallow  water  equations.  The  equations  are 
geostrophically  balanced  and  remain  so  for  the  duration  of  the  integration  where  the  velocity  field 
remains  constant  throughout  the  computation.  The  geopotential  height  <j>  undergoes  a  solid  body 
rotation  but  since  the  initial  height  field  is  given  as  a  constant  band  in  the  zonal  direction  (left 
to  right)  and  the  flow  field  is  purely  zonal,  then  the  solution  remains  unchanged  throughout  the 
time-integration.  The  velocity  field  is  the  same  as  that  used  in  case  1  and  we  integrate  this  test  for 
5  days  (see  [25]  for  its  definition). 


a)  b) 


Figure  5:  Case  2.  The  <f>  L2  error  norm  as  a  function  of  Courant  number  for  a  grid  consisting  of  384 
elements  each  of  8th  order  for  the  a)  SL  and  b)  OIFS  methods. 


Figure  5  shows  the  errors  after  a  5  day  integration  for  384  elements  tiling  the  sphere  each  of 
which  is  8th  order  accurate  for  a  total  of  24,578  grid  points.  In  Fig.  5a  we  see  that  the  SL  method 
with  RK4,  once  again,  yields  better  results  than  RK2  and  RK3  for  increasing  time-step.  However, 
above  Courant  numbers  of  15  all  three  SL  methods  lose  their  accuracy  and  yield  similar  results. 

In  contrast,  Fig.  5b  shows  that  all  three  OIFS  methods  yield  very  similar  results  for  all  Courant 
number  values.  In  fact,  comparing  Figs.  5a  and  5b  one  can  see  that  all  three  OIFS  methods  yield 
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results  quite  similar  to  the  SL  RK4  method.  Again,  it  is  not  suprising  that  the  OIFS  methods  give 
similar  results  since  it  allows  for  substepping.  In  other  words,  the  OIFS  RK2  has  a  smaller  stability 
region  than  the  OIFS  RK3  and  RK4  methods.  Thus,  in  order  to  use  OIFS  RK2  with  a  Courant 
number  of  15  requires  8  substeps  per  time-step.  Clearly,  the  number  of  substeps  used  will  impact 
the  overall  efficiency  of  the  scheme. 

One  final  comment  on  the  results  presented  in  Fig.  5.  Falconi  and  Ferretti  [26]  showed  that  the 
error  of  the  semi-Lagrangian  method  is  of  the  order 

o( >  +  w 

where  K  and  N  represent  the  order  of  the  trajectory  approximation  and  the  order  of  the  interpolation 
functions,  respectively.  In  other  words  for  the  trajectory  equation,  Eq.  (27),  K  represents  the  order 
of  accuracy  of  the  numerical  scheme  used  to  solve  this  equation  while  N  represents  the  order  of  the 
interpolation  functions  used  to  compute  the  variables  at  the  feet  of  the  characteristics  (i.e.,  departure 
points).  The  importance  of  this  analysis  is  that  it  then  implies  that  for  the  SL  method  there  exists 
an  optimal  time-step  for  a  given  grid  spacing.  In  Fig.  5a  we  see  that  there  is  indeed  an  optimal 
time-step  and  it  corresponds  to  a  Courant  number  of  approximately  13.  Looking  now  at  Fig.  5b 
we  see  that  the  optimal  Courant  number  for  the  OIFS  method  is,  once  again,  13.  In  general  we  do 
not  expect  the  optimal  time-steps  of  both  the  SL  and  OIFS  methods  to  coincide;  however,  what  is 
interesting  is  that  the  OIFS  method  also  seems  to  follow  the  convergence  pattern  given  by  Eq.  (5) 
where  N  now  represents  the  order  of  accuracy  of  the  discrete  spatial  operators  (such  as  the  gradient 
operator  required  in  the  advection  problem)  and  K  the  order  of  the  time-integrator  used  to  solve 
the  advection  step.  Let  us  now  turn  to  the  computational  cost  of  the  SL  and  OIFS  methods. 


a)  b) 


Figure  6:  Case  2.  The  computational  cost  in  seconds  as  a  function  of  Courant  number  for  a  grid 
consisting  of  384  elements  each  of  8th  order  for  the  a)  SL  and  b)  OIFS  methods. 


Figure  6  shows  the  computational  costs  associated  with  the  results  shown  in  Fig.  5  for  the  SL 
and  OIFS  methods.  Not  suprisingly,  Fig.  6a  shows  that  the  SL  RK4  method  is  the  most  costly. 
However,  the  cost  of  all  three  SL  methods  drops  with  increasing  time-step.  This  too  is  to  be  expected 
since  the  cost  of  the  SL  method  per  time-step  is  constant  regardless  whether  the  time-step  is  small 
or  very  large.  Thus  for  a  pure  advection  problem,  the  cost  would  continue  to  decrease  for  increasing 
time-step.  The  reason  why  the  cost  plateaus  is  because  the  number  of  iterations  required  by  the 
implicit  solver  to  converge  increases  with  increasing  time-step. 

Figure  6b,  on  the  other  hand,  shows  that  the  OIFS  RK4  actually  costs  less  even  though  it  is 
higher  order  than  RK2  and  RK3.  Recall,  that  the  only  reason  why  OIFS  RK2  and  RK3  can  use 
Courant  numbers  nearing  15  is  because  substepping  is  used.  The  higher  efficiency  of  OIFS  RK4 
is  due  to  the  fact  that  the  RK4  method  requires  fewer  substeps  to  be  taken.  For  example,  for  the 
largest  Courant  number  shown  (17)  the  OIFS  RK2  required  8  substeps  while  the  OIFS  RK3  required 
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only  3  substeps,  and  OIFS  RK4  only  2.  An  interesting  difference  between  the  SL  and  OIFS  methods 
is  that  the  SL  method,  generally,  becomes  more  efficient  as  the  time-step  is  increased  whereas  the 
OIFS  method  clearly  seems  to  have  an  optimal  time-step  range.  For  this  specific  test  case  the  range 
seems  to  be  somewhere  around  a  Courant  number  of  7.  However,  as  the  time-step  is  increased  the 
efficiency  of  SL  RK2  begins  to  approach  that  of  the  OIFS  methods.  In  short,  SL  RK2  and  OIFS 
RK4  are  the  best  compromises  between  accuracy  and  efficiency;  we  compare  these  two  methods  in 
Fig.  7.  These  are  the  two  methods  which  we  shall  study  in  depth  in  the  next  two  sections. 


Figure  7:  Case  2.  The  computational  cost  in  seconds  as  a  function  of  Courant  number  for  a  grid 
consisting  of  384  elements  each  of  8th  order  for  the  SL  RK2  (solid  line)  and  the  OIFS  RK4  (dashed 
line)  methods. 


5.3  Case  3:  Zonal  Flow  over  an  Isolated  Mountain 

This  case  uses  the  same  initial  conditions  as  case  2  with  the  addition  of  a  conical  mountain  at 
(A ,ip)  =  (180,30).  Due  to  the  zonal  flow  impinging  on  the  mountain,  various  wave  structures  form 
and  propagate  throughout  the  sphere.  This  test  is  run  for  10  days  (see  [25]  for  details).  Figure  8 
shows  the  computational  cost  for  a  10  day  integration  for  384  elements  tiling  the  sphere  each  of 
which  is  8th  order  accurate  for  a  total  of  24,578  grid  points. 

Figure  8  compares  the  computational  cost  of  SL  RK2  and  OIFS  RK4.  Note  that  the  efficiency 
plot  of  both  methods  looks  somewhat  erratic  but  as  we  shall  see  the  results  make  perfect  sense. 
For  SL  RK2  (solid  line)  the  advection  step  cost  remains  constant  regardless  of  the  time-step  size; 
this  explains  the  decrease  in  computational  cost  for  increasing  Courant  number.  However,  observe 
that  the  cost  associated  with  C=12  increases.  This  is  due  to  a  very  large  jump  in  the  number  of 
iterations  required  by  the  iterative  solver  for  convergence.  For  C=10  only  68  iterations  are  required 
whereas  for  C=12  that  number  jumps  to  89.  For  the  largest  Courant  number  used  (C=19)  that 
number  further  increases  to  141  iterations  which  explains  why  the  cost  increases. 

For  the  OIFS  RK4  (dashed  line)  there  is  a  balance  between  the  Courant  number,  the  number 
of  iterations,  and  the  number  of  substeps.  For  C  <  10  the  OIFS  RK4  only  requires  one  substep; 
however,  beyond  this  Courant  number  it  requires  2  substeps.  This  is  the  reason  why  the  cost  increases 
for  C=12  because  the  time-step  has  not  increased  by  much  between  C=10  and  C=12  but  now  twice 
the  work  has  to  be  done  for  the  advection  step,  not  to  mention  that  the  number  of  iterations  has 
also  increased  from  70  (for  C=10)  to  89  (for  C=12). 
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Figure  8:  Case  3.  The  computational  cost  in  seconds  as  a  function  of  Courant  number  for  a  grid 
consisting  of  384  elements  each  of  8th  order  for  the  SL  RK2  (solid  line)  and  the  OIFS  RK4  (dashed 
line)  methods. 


5.4  Case  4:  Balanced  Zonal  Jet 

This  test  case  consists  of  a  zonal  jet  and  an  unperturbed  balanced  initial  geopotential  height  field. 
The  balanced  initial  field  should  be  maintained  indefinitely  but  Galewsky  et  al.  [27]  suggest  running 
the  case  for  5  days.  This  is  a  rather  stringent  test  of  shallow  water  models  because  if  the  accuracy 
and/or  the  resolution  is  not  sufficiently  high  then  the  model  will  not  be  able  to  sustain  the  balanced 
initial  field  and  the  error  will  increase  quite  rapidly.  In  addition,  because  the  jet  is  zonally  positioned, 
then  any  grid  that  is  not  aligned  with  the  zonal  direction  will  have  much  more  difficulty  maintaining 
the  jet. 


Figure  9:  Case  4.  The  a)  <j>  L2  error  norm  and  b)  computational  cost  in  seconds  as  a  function  of 
Courant  number  for  the  SL  RK2  and  OIFS  RK4  methods  using  a  grid  consisting  of  600  elements 
each  of  8th  order. 


Figure  9  shows  the  errors  and  computational  cost  for  a  5  day  integration  for  600  elements  tiling 
the  sphere  each  of  which  is  8th  order  accurate  for  a  total  of  38,402  grid  points.  Figure  9a  shows  that 
the  SL  method  can  sustain  high  accuracy  for  extremely  large  time-steps.  In  fact,  for  this  test  case 
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the  SL  method  maintains  high  accuracy  for  Courant  numbers  of  25.  In  contrast,  the  OIFS  method 
yields  high  accuracy  for  Courant  numbers  below  7.  As  the  Courant  number  is  increased  the  OIFS 
method  cannot  compete  with  the  SL  method  in  terms  of  accuracy  for  such  large  time-steps. 

Figure  9b,  however,  shows  that  the  OIFS  method  even  with  moderately  large  Courant  numbers 
(C=7)  is  more  efficient  than  the  SL  method  using  Courant  numbers  of  25.  For  Courant  numbers 
below  7  both  methods  only  require  38  iterations  per  time-step  for  the  GMRES  solver;  however,  the 
SL  method  using  a  Courant  number  of  25  requires  over  200  iterations!  Even  though  the  SL  method 
is  taking  a  large  time-step,  this  large  time-step  is  affecting  the  condition  number  of  the  matrix  which 
then  requires  a  signficant  amount  of  iterations.  It  should  be  mentioned  that  implementing  state-of- 
the-art  preconditioners  may  reduce  the  number  of  iterations;  this  should  be  further  studied  but  is 
beyond  the  scope  of  the  current  work. 

To  show  one  of  the  advantages  of  the  OIFS  method  we  now  run  the  simulation  on  an  icosahedral 
grid.  Recall  that  this  grid  represents  an  unstructured  grid  which  poses  no  difficulties  to  the  OIFS 
method;  for  the  SL  method  one  would  have  to  devise  a  new  search  strategy  and  in  order  to  do 
it  efficiently  requires  exploiting  the  grid  geometry.  While  this  is  not  impossible  to  do  for  the  SL 
method  it  does  mean  that  whenever  a  new  grid  is  used  special  care  must  be  taken  in  order  to 
construct  efficient  algorithms.  It  is  also  possible  to  construct  generalized  search  algorithms  but 
typically  they  are  not  as  efficient  as  ad  hoc  methods.  However,  the  point  here  is  that  the  OIFS 
method  needs  no  modification  or  optimization  whatsoever  whenever  a  new  grid  is  introduced. 

In  Figs.  10  and  11  we  show  the  geopotential  height,  <f>,  on  the  icosahedral  and  hexahedral  grids. 
These  results  use  8th  order  polynomials  and  a  Courant  Number  of  7  which,  for  this  test  case,  is 
a  factor  of  six  larger  than  possible  with  the  standard  semi-implicit  method.  The  icosahedral  grid 
contains  a  total  of  34,562  grid  points  while  the  hexahedral  grid  contains  38,402.  The  right  panels  of 
these  figures  show  that  the  zonal  jet  is  maintained  for  the  duration  of  the  5  day  integration. 


Figure  10:  Case  4.  The  icosahedral  grid  (left  panel)  and  <j>  contours  (right  panel)  for  a  5  day 
simulation.  The  simulation  uses  8th  order  polynomials  and  a  Courant  Number  of  7;  the  view  is 
looking  down  on  the  North  Pole. 


Figure  11:  Case  4.  The  hexahedral  grid  (left  panel)  and  <p  contours  (right  panel)  for  a  5  day 
simulation.  The  simulation  uses  8th  order  polynomials  and  a  Courant  Number  of  7;  the  view  is 
looking  down  on  the  North  Pole. 
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6  Conclusions 


A  hybrid  Eulerian-Lagrangian  semi-implicit  (HELSI)  time- integrator  has  been  presented.  HELSI 
essentially  builds  the  semi-Lagrangian  (SL)  and  operator-integration-factor  splitting  (OIFS)  method 
into  the  standard  formulation  of  the  semi-implicit  method.  This  approach  makes  it  very  simple 
to  augment  a  standard  Eulerian  semi-implicit  method  to  a  Lagrangian  formulation;  this  increases 
the  efficiency  of  the  code  by  allowing  the  model  to  use  very  large  time-steps.  Furthermore,  the 
construction  of  HELSI  unifies  the  SL  and  OIFS  methods  into  a  single  formulation.  The  advantage 
of  this  is  that  the  similarity  of  the  two  methods  can  be  appreciated  and  only  a  single  computer  code 
needs  to  be  maintained.  The  SL  and  OIFS  methods  have  their  strengths  and  weaknesses  which 
we  have  outlined  and  discussed.  While  the  results  shown  here  for  the  shallow  water  equations  are 
quite  impressive  it  remains  to  be  seen  whether  such  large  time-steps  can  be  taken  for  more  complex 
equation  sets  such  as  the  primitive  atmospheric  equations  which  we  plan  to  address  in  future  work. 
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